27 research outputs found

    LENTA: Longitudinal Exploration for Network Traffic Analysis from Passive Data

    In this work, we present LENTA (Longitudinal Exploration for Network Traffic Analysis), a system that supports network analysts in identifying traffic generated by services and applications running on the web. For URLs observed in an operational network, LENTA simplifies the analyst's job by letting her observe a few hundred clusters instead of the original hundreds of thousands of individual URLs. We implement a self-learning methodology in which the system grows its knowledge, which is used in turn to automatically associate traffic with previously observed services and to identify new traffic generated by possibly suspicious applications. This approach lets analysts easily observe changes in network traffic, identify new services, and spot unexpected activities. We follow a data-driven approach and run LENTA on traces collected both in ISP networks and directly on hosts via proxies. We analyze traffic in batches of 24 hours' worth of traffic. Big data solutions are used to enable horizontal scalability and meet performance requirements. We show that LENTA allows the analyst to clearly understand which services are running on their network, possibly highlighting malicious traffic and changes over time, greatly simplifying the view and understanding of the network traffic.

    LENTA: Longitudinal Exploration for Network Traffic Analysis

    In this work, we present LENTA (Longitudinal Exploration for Network Traffic Analysis), a system that supports network analysts in easily identifying traffic generated by services and applications running on the web, whether benign or possibly malicious. First, LENTA simplifies the analysts' job by letting them observe a few hundred clusters instead of the original hundreds of thousands of individual URLs. Second, it implements a self-learning methodology in which a semi-supervised approach lets the system grow its knowledge, which is used in turn to automatically associate traffic with previously observed services and identify new traffic generated by possibly suspicious applications. This lets analysts easily observe changes in the traffic, like the birth of new services or unexpected activities. We follow a data-driven approach, running LENTA on real data. Traffic is analyzed in batches of 24 hours' worth of traffic. We show that LENTA allows the analyst to easily understand which services are running on their network, highlights malicious traffic and changes over time, and greatly simplifies the view and understanding of the traffic.
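
    The self-learning association step described above can be pictured with a small sketch. The knowledge-base format, the similarity measure, and the 0.8 threshold are all assumptions for illustration, not LENTA's actual implementation:

    ```python
    # Illustrative sketch of matching new URL clusters against a knowledge
    # base of previously labeled cluster representatives; clusters that
    # match no known service are flagged for the analyst as new/suspicious.
    from difflib import SequenceMatcher

    def similarity(a: str, b: str) -> float:
        """Normalized string similarity between two URL representatives."""
        return SequenceMatcher(None, a, b).ratio()

    def associate(new_clusters, knowledge_base, threshold=0.8):
        """Label each new cluster with the closest known service, if any.

        knowledge_base maps a service label to a representative URL;
        clusters scoring below threshold are returned as unknown."""
        labeled, unknown = {}, []
        for rep in new_clusters:
            best_label, best_score = None, 0.0
            for label, known_rep in knowledge_base.items():
                score = similarity(rep, known_rep)
                if score > best_score:
                    best_label, best_score = label, score
            if best_score >= threshold:
                labeled[rep] = best_label
                knowledge_base[best_label] = rep  # refresh representative
            else:
                unknown.append(rep)  # candidate new or suspicious service
        return labeled, unknown

    # Hypothetical data: one known CDN service, two freshly seen clusters.
    kb = {"cdn-images": "img.example-cdn.com/assets/photo123.jpg"}
    labeled, unknown = associate(
        ["img.example-cdn.com/assets/photo999.jpg",
         "x9z.evil-tracker.biz/p?id=1"], kb)
    ```

    The first cluster is absorbed into the known service; the second survives as "unknown" and would be surfaced to the analyst.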

    A method for exploring passive traffic traces and grouping similar URLs

    A computer security method for the analysis of passive traces of HTTP and HTTPS traffic on the Internet, with extraction and grouping of similar Web transactions automatically generated by malware, malicious services, unsolicited advertising, or other sources, comprises at least the following processing and control steps: a) URL extraction from an operational network, using passive exploration of the HTTP and HTTPS traffic data, and subsequent collection of the extracted URLs into batches; b) detection of similar URLs, by calculating metrics based on the distance among URLs, namely on a measure of the degree of diversity among pairs of the character strings composing the URLs; c) activation of one or more clustering algorithms that group the URLs based on the similarity metrics and obtain, within each group of URLs, elements with similar/homogeneous features suited to be analyzed as a single entity; d) visualization of the elements sorted by the degree of cohesion of the URLs contained in each grouping.
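
    Steps (b) through (d) can be sketched as follows. The distance function, the greedy grouping (standing in for the patent's clustering algorithms), and the 0.3 threshold are assumptions chosen for illustration:

    ```python
    # Sketch of steps (b)-(d): a pairwise URL distance (degree of
    # diversity), a simple greedy grouping in place of the clustering
    # step, and an ordering of groups by internal cohesion.
    from difflib import SequenceMatcher
    from statistics import mean

    def url_distance(u1: str, u2: str) -> float:
        """Degree of diversity between two URL strings, in [0, 1]."""
        return 1.0 - SequenceMatcher(None, u1, u2).ratio()

    def group_urls(urls, max_dist=0.3):
        """Greedy grouping: a URL joins the first group whose seed is close."""
        groups = []
        for url in urls:
            for g in groups:
                if url_distance(url, g[0]) <= max_dist:
                    g.append(url)
                    break
            else:
                groups.append([url])
        return groups

    def cohesion(group):
        """Mean pairwise similarity inside a group (1.0 for singletons)."""
        if len(group) < 2:
            return 1.0
        pairs = [(a, b) for i, a in enumerate(group) for b in group[i + 1:]]
        return mean(1.0 - url_distance(a, b) for a, b in pairs)

    # Hypothetical batch: two auto-generated ad-tracker URLs and one other.
    urls = [
        "ads.example.net/track?id=001",
        "ads.example.net/track?id=002",
        "update.example-av.com/check",
    ]
    groups = sorted(group_urls(urls), key=cohesion, reverse=True)
    ```

    On this toy batch the two tracker URLs end up in one group and the unrelated URL in another, mirroring step (c); the sort mirrors the cohesion-based visualization of step (d).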

    Collaboration vs. choreography conformance in BPMN

    The BPMN 2.0 standard is a widely used semi-formal notation for modeling distributed information systems from different perspectives. The standard makes available a set of diagrams to represent such perspectives. Choreography diagrams represent global constraints concerning the interactions among system components without exposing their internal structure. Collaboration diagrams instead depict the internal behaviour of a component, also referred to as a process, when integrated with others, so as to represent a possible implementation of the distributed system. This paper proposes a design methodology and a formal framework for checking conformance of choreographies against collaborations. In particular, the paper presents a direct formal operational semantics for both BPMN choreography and collaboration diagrams. Conformance is then captured through two relations defined on top of this semantics. The approach benefits from the availability of a tool we have developed, named C4, that permits experimenting with the theoretical framework in practical contexts. The objective here is to make the exploited formal methods transparent to system designers, thus fostering wider adoption by practitioners.
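
    To give a flavor of what "conformance against a choreography" means, here is a rough sketch, not the paper's actual relations or semantics, of bounded trace inclusion between two labeled transition systems: every interaction sequence prescribed by the choreography must be producible by the collaboration:

    ```python
    # Toy conformance check: bounded trace inclusion between a
    # choreography LTS and a collaboration LTS (illustrative only).

    def traces(lts, start, depth):
        """All label sequences of length <= depth from `start`.

        `lts` maps a state to a list of (label, next_state) pairs."""
        result = {()}
        frontier = {((), start)}
        for _ in range(depth):
            nxt = set()
            for trace, state in frontier:
                for label, succ in lts.get(state, []):
                    t = trace + (label,)
                    result.add(t)
                    nxt.add((t, succ))
            frontier = nxt
        return result

    def conforms(collab, c0, choreo, h0, depth=5):
        """The collaboration conforms if it exhibits every choreography trace."""
        return traces(choreo, h0, depth) <= traces(collab, c0, depth)

    # Hypothetical example: Buyer->Seller "order", then Seller->Buyer "confirm".
    choreo = {"s0": [("order", "s1")], "s1": [("confirm", "s2")]}
    # A collaboration whose processes realize the same interactions.
    collab = {"q0": [("order", "q1")], "q1": [("confirm", "q2")]}
    assert conforms(collab, "q0", choreo, "s0")
    ```

    A collaboration that never sends the "confirm" message would fail this check, which is the kind of mismatch the paper's relations are designed to detect (with a far richer semantics covering BPMN's actual constructs).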

    A Survey on Big Data for Network Traffic Monitoring and Analysis

    Network Traffic Monitoring and Analysis (NTMA) represents a key component of network management, especially for guaranteeing the correct operation of large-scale networks such as the Internet. As the complexity of Internet services and the volume of traffic continue to increase, it becomes difficult to design scalable NTMA applications. Applications such as traffic classification and policing require real-time and scalable approaches. Anomaly detection and security mechanisms require quickly identifying and reacting to unpredictable events while processing millions of heterogeneous events. Finally, the system has to collect, store, and process massive sets of historical data for post-mortem analysis. These are precisely the challenges faced by general big data approaches: Volume, Velocity, Variety, and Veracity. This survey brings together NTMA and big data. We catalog previous work on NTMA that adopts big data approaches to understand to what extent the potential of big data is being explored in NTMA. This survey mainly focuses on approaches and technologies to manage the big NTMA data, additionally briefly discussing big data analytics (e.g., machine learning) for the sake of NTMA. Finally, we provide guidelines for future work, discussing lessons learned and research directions.

    UMAP: Urban Mobility Analysis Platform to Harvest Car Sharing Data

    Car sharing is nowadays a popular transport means in smart cities. In particular, the free-floating paradigm lets users look for available cars, book one, and then start and stop the rental at will, within the city area. This is done through a smartphone app, which in turn contacts a web-based backend to exchange information. In this paper we present UMAP, a platform that harvests data freely made available on the web to extract driving habits in cities. We design UMAP to fetch data from car sharing platforms in real time and process it to extract more advanced information about driving patterns and users' habits, augmenting the data with mapping and direction information fetched from other web platforms. This information is stored in a data lake where historical series are built and later analyzed using easy-to-design, customizable analytics modules. We prove the flexibility of UMAP by presenting a case study for the city of Turin. We collect car sharing usage data over 50 days and characterize both the temporal and spatial properties of rentals, as well as users' habits in using the service, which we contrast with public transportation alternatives. Results provide insights into driving styles and needs that are useful for smart city planners, and prove the feasibility of our approach.
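
    One way such harvesting can work, sketched here under assumed data shapes rather than any real platform API, is to poll the availability map periodically: a car that disappears between two snapshots has started a rental, and it reappearing marks the rental's end:

    ```python
    # Illustrative reconstruction of rentals from periodic availability
    # snapshots (timestamps and plate sets are hypothetical sample data).

    def rentals_from_snapshots(snapshots):
        """Infer (plate, start, end) rentals from consecutive snapshots.

        `snapshots` is a time-ordered list of (timestamp, set_of_plates)."""
        rentals, open_rentals = [], {}
        prev_t, prev_plates = snapshots[0]
        for t, plates in snapshots[1:]:
            for plate in prev_plates - plates:   # car vanished: rental began
                open_rentals[plate] = prev_t
            for plate in plates - prev_plates:   # car is back: rental ended
                if plate in open_rentals:
                    rentals.append((plate, open_rentals.pop(plate), t))
            prev_t, prev_plates = t, plates
        return rentals

    # Toy trace: AA111 is rented between the first and third snapshot.
    snaps = [
        (0,  {"AA111", "BB222"}),
        (5,  {"BB222"}),
        (10, {"AA111", "BB222"}),
    ]
    trips = rentals_from_snapshots(snaps)
    ```

    From such (plate, start, end) tuples, the temporal and spatial characterizations described in the paper (rental durations, demand over the day) can be built by ordinary aggregation over the data lake.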

    A formal approach to decision support on Mobile Cloud Computing applications

    Mobile Cloud Computing (MCC) is an emerging topic that has grown with the explosion of mobile applications. In MCC systems, application functionalities are dynamically partitioned between mobile devices and cloud infrastructures. The main research direction in this field aims at optimizing different metrics, like performance, energy efficiency, reliability, and security, in the dynamic environment in which the MCC application is located. Optimization in MCC refers to taking advantage of the offloading process, which consists of moving computation from the local device to a remote one. The biggest challenge in this respect is to define a strategy able to decide when to offload and which part of the application to move. This technique generally improves the efficiency of a system, although sometimes it can lead to performance degradation. To decide when and what to offload, in this thesis we propose a new general framework supporting the design and runtime execution of applications in their own MCC scenarios. In particular, the framework provides a new specification language, called MobiCa, equipped with a formal semantics that permits capturing all the characteristics of an MCC system. Besides the strategy optimization achieved by exploiting the potential of the model checker UPPAAL, we propose a set of methods for determining optimal finite/infinite schedules. They manage the resource assignment of components with the aim of improving system efficiency in terms of battery consumption and time. Furthermore, we propose two optimized scheduling algorithms, developed in Java, that exploit parallel computation to improve system performance.
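
    The offloading trade-off the thesis builds on can be illustrated with a textbook cost model (this is a generic sketch, not MobiCa or the thesis's UPPAAL models): offload a task only when remote execution plus data transfer beats local execution in both time and energy:

    ```python
    # Toy when-to-offload decision: compare local execution against
    # transfer + remote execution, in time and in device-side energy.

    def should_offload(cycles, data_bits,
                       local_speed, remote_speed, bandwidth,
                       p_compute, p_transmit, p_idle):
        """Return True if offloading saves both time and energy.

        cycles: CPU cycles the task needs; data_bits: state to transfer;
        *_speed in cycles/s, bandwidth in bits/s, p_* in watts."""
        t_local = cycles / local_speed
        t_remote = data_bits / bandwidth + cycles / remote_speed
        e_local = p_compute * t_local
        # While offloaded, the device transmits, then idles waiting.
        e_remote = p_transmit * (data_bits / bandwidth) \
            + p_idle * (cycles / remote_speed)
        return t_remote < t_local and e_remote < e_local

    # A compute-heavy task with little state strongly favors offloading
    # (all parameter values below are hypothetical).
    assert should_offload(cycles=5e9, data_bits=8e5,
                          local_speed=1e9, remote_speed=10e9,
                          bandwidth=1e7, p_compute=0.9,
                          p_transmit=1.3, p_idle=0.3)
    ```

    Flip the profile to a small computation with a large state to transfer and the same model says to stay local, which is exactly the "sometimes offloading degrades performance" case the abstract mentions.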

    Machine Learning and Big Data Approaches for Automatic Internet Monitoring

    No full text
    The abstract is in the attachment.

    Clustering and evolutionary approach for longitudinal web traffic analysis

    No full text
    In recent years, data-driven approaches have attracted the interest of the research community. Considering network monitoring, unsupervised machine learning solutions such as clustering are particularly appealing for letting network analysts observe patterns and track the evolution of traffic over time. In this paper, we present a novel unsupervised methodology to automatically process and analyze batches of HTTP traffic, looking just at the URL structure. First, we describe IDBSCAN (Iterative-DBSCAN). We design it to obtain well-shaped clusters and to simplify the choice of parameters, often a cumbersome step for the network analyst. Second, we show LENTA (Longitudinal Exploration for Network Traffic Analysis), which allows the analyst to automatically observe the evolution of traffic over time, naturally highlighting trends and pinpointing anomalies. We first evaluate IDBSCAN and LENTA on synthetic data to compare their performance against well-known algorithms. Then we apply them to a real case, facing the analysis of hundreds of thousands of URLs collected from a live network. Results show both the goodness of the clusters produced by IDBSCAN and LENTA's ability to highlight changes in traffic, facilitating the analyst's job.
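
    The iterative idea behind IDBSCAN can be sketched compactly. The toy DBSCAN, the string distance, and all parameter values below are illustrative assumptions, not the paper's implementation: cluster once, keep the groups found, then re-run on the leftover noise with a progressively relaxed eps:

    ```python
    # Illustrative Iterative-DBSCAN over URL strings: repeated DBSCAN
    # passes on the residual noise with a relaxed eps at each pass.
    from difflib import SequenceMatcher

    def dist(a, b):
        """String distance in [0, 1] between two URLs."""
        return 1.0 - SequenceMatcher(None, a, b).ratio()

    def dbscan(points, eps, min_pts):
        """A minimal DBSCAN over strings; returns (clusters, noise)."""
        labels, cid = {}, 0
        for p in points:
            if p in labels:
                continue
            if len([q for q in points if dist(p, q) <= eps]) < min_pts:
                continue  # not a core point; left unlabeled for now
            cid += 1
            seeds = [p]
            while seeds:
                q = seeds.pop()
                if q in labels:
                    continue
                labels[q] = cid
                q_neigh = [r for r in points if dist(q, r) <= eps]
                if len(q_neigh) >= min_pts:  # expand from core points only
                    seeds.extend(q_neigh)
        groups = {}
        for p, c in labels.items():
            groups.setdefault(c, []).append(p)
        noise = [p for p in points if p not in labels]
        return list(groups.values()), noise

    def idbscan(points, eps=0.2, min_pts=2, step=0.1, max_eps=0.5):
        """Re-cluster the leftover noise with a relaxed eps each pass."""
        clusters, remaining = [], list(points)
        while remaining and eps <= max_eps:
            found, remaining = dbscan(remaining, eps, min_pts)
            clusters.extend(found)
            eps += step
        return clusters, remaining

    # Hypothetical batch: three near-identical URLs plus one outlier.
    urls = [
        "shop.example.com/item?id=11",
        "shop.example.com/item?id=12",
        "shop.example.com/item?id=13",
        "totally-different.org/a",
    ]
    clusters, noise = idbscan(urls)
    ```

    On this toy batch the three similar URLs form one well-shaped cluster in the first pass, and the outlier survives every pass as noise; the iteration is what spares the analyst from hand-tuning a single global eps.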